Comparing Expert and Metric-Based Assessments of Association Rule Interestingness
Abstract
In association rule mining, interestingness refers to metrics, beyond support and confidence, that are applied to select association rules. For example, Merceron & Yacef (2008) recommend that researchers use a combination of lift and cosine to select association rules, after first filtering out rules with low support and confidence. However, the empirical basis for treating these specific metrics as evidence of interestingness is rather weak. In this study, we examine these metrics by distilling association rules from real educational data relevant to established research questions in the areas of affect and disengagement. We then ask three domain experts to rate the interestingness of the resulting rules. Finally, we analyze the data to determine which metric(s) best agree with expert judgments of interestingness. We find that the recommendation of Merceron & Yacef (2008) holds: lift and cosine are good indicators of interestingness. In addition, the Phi coefficient, conviction, and Jaccard also turn out to be good indicators of interestingness.
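For readers unfamiliar with the metrics named in the abstract, the following is a minimal Python sketch of how they are commonly computed for a rule A -> B from transaction counts. It uses the standard textbook definitions, which may differ in detail from the formulations used in the study; the function name `rule_metrics` and the example counts are illustrative assumptions, not taken from the paper.

```python
import math


def rule_metrics(n, n_a, n_b, n_ab):
    """Common interestingness metrics for a rule A -> B (hypothetical helper).

    n    : total number of transactions
    n_a  : transactions containing the antecedent A
    n_b  : transactions containing the consequent B
    n_ab : transactions containing both A and B
    Assumes 0 < n_a < n and 0 < n_b < n so no denominator is zero.
    """
    if not (0 < n_a < n and 0 < n_b < n and 0 <= n_ab <= min(n_a, n_b)):
        raise ValueError("counts must satisfy 0 < n_a, n_b < n and n_ab <= min(n_a, n_b)")

    p_a, p_b, p_ab = n_a / n, n_b / n, n_ab / n
    confidence = n_ab / n_a  # P(B | A)

    return {
        "support": p_ab,
        "confidence": confidence,
        # Lift: observed co-occurrence relative to independence (1.0 = independent).
        "lift": p_ab / (p_a * p_b),
        # Cosine: co-occurrence normalized by the geometric mean of the marginals.
        "cosine": p_ab / math.sqrt(p_a * p_b),
        # Jaccard: co-occurrence over transactions containing A or B.
        "jaccard": p_ab / (p_a + p_b - p_ab),
        # Phi coefficient: correlation between the binary indicators for A and B.
        "phi": (p_ab - p_a * p_b) / math.sqrt(p_a * p_b * (1 - p_a) * (1 - p_b)),
        # Conviction: unbounded when the rule never fails (confidence == 1).
        "conviction": math.inf if confidence == 1 else (1 - p_b) / (1 - confidence),
    }


if __name__ == "__main__":
    # Made-up counts for illustration only.
    for name, value in rule_metrics(n=100, n_a=40, n_b=50, n_ab=30).items():
        print(f"{name}: {value:.4f}")
```

With these made-up counts, lift comes out at roughly 1.5 and conviction at roughly 2, so the rule would look positively associated under both measures. Whether such values match expert judgment is exactly the question the study addresses.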
Similar resources
Numeric Multi-Objective Rule Mining Using Simulated Annealing Algorithm
Abstract as a single objective one. Measures like support, confidence and other interestingness criteria, which are used for evaluating a rule, can be thought of as different objectives of the association rule mining problem. Support count is the number of records that satisfy all the conditions in the rule. This objective represents the accuracy of the rules extracted from the da...
Boolean Analyzer - An Algorithm That Uses A Probabilistic Interestingness Measure to find Dependency/Association Rules In A He
A new, binary-based technique is presented for finding dependency/association rules called the Boolean Analyzer (BA). With initial guidance from a domain user or domain expert, BA is given one or more metrics to partition the entire data set. This leads to analyzing the implicit domain knowledge and creating weighted rules in the form of boolean expressions. To augment the analysis of the rules...
On selecting interestingness measures for association rules: User oriented description and multiple criteria decision aid
Data mining algorithms, especially those used for unsupervised learning, generate a large quantity of rules. In particular this applies to the Apriori family of algorithms for the determination of association rules. It is hence impossible for an expert in the field being mined to sustain these rules. To help carry out the task, many measures which evaluate the interestingness of rules have been...
Modeling interestingness of streaming association rules as a benefit-maximizing classification problem
In a typical application of association rule learning from market basket data, a set of transactions for a fixe...
Ranking discovered rules from data mining with multiple criteria by data envelopment analysis
In data mining applications, it is important to develop evaluation methods for selecting quality and profitable rules. This paper utilizes a non-parametric approach, Data Envelopment Analysis (DEA), to estimate and rank the efficiency of association rules with multiple criteria. The interestingness of association rules is conventionally measured based on support and confidence. For specific app...